Applying Natural Language Processing Techniques for Effective Persian-English Cross-Language Information Retrieval
Author
Abstract
Much attention has recently been paid to natural language processing in information storage and retrieval. This paper describes how the application of natural language processing (NLP) techniques can enhance cross-language information retrieval (CLIR). Using a semi-experimental technique, we used Farsi queries to retrieve relevant documents in English. To translate the Persian queries, we used a bilingual machine-readable dictionary. NLP techniques such as tokenization, morphological analysis and part-of-speech tagging were used in the pre- and post-translation phases. Results showed that applying NLP techniques yields more effective CLIR performance.
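The pipeline outlined in the abstract (tokenize the query, reduce tokens to dictionary forms, then look them up in a bilingual dictionary) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the dictionary entries, the suffix list, and the function names `tokenize`, `lemmatize`, and `translate_query` are all hypothetical, and part-of-speech tagging is left out for brevity.

```python
# Toy Persian-to-English dictionary (hypothetical entries, illustration only).
BILINGUAL_DICT = {
    "کتاب": ["book"],
    "بازیابی": ["retrieval"],
    "اطلاعات": ["information", "data"],
}

# Hypothetical plural suffixes; real morphological analysis is far richer.
PLURAL_SUFFIXES = ("های", "ها")


def tokenize(query):
    """Split a query into tokens on whitespace."""
    return query.split()


def lemmatize(token):
    """Strip a known plural suffix if the remaining stem is in the dictionary."""
    if token in BILINGUAL_DICT:
        return token
    for suffix in PLURAL_SUFFIXES:
        if token.endswith(suffix):
            stem = token[: -len(suffix)]
            if stem in BILINGUAL_DICT:
                return stem
    return token


def translate_query(query):
    """Translate each token via the dictionary; keep unknown tokens unchanged."""
    english_terms = []
    for token in tokenize(query):
        lemma = lemmatize(token)
        english_terms.extend(BILINGUAL_DICT.get(lemma, [token]))
    return english_terms
```

In this sketch every dictionary sense of a term is kept, so an ambiguous word expands into several English query terms. Choosing among the senses would be a separate post-translation step.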
Similar resources
Applying Light Natural Language Processing to Ad-Hoc Cross Language Information Retrieval
In the CLEF 2005 Ad-Hoc Track we experimented with language-specific morphosyntactic processing and light Natural Language Processing (NLP) for the retrieval of Bulgarian, French, Italian, English and Greek.
Solving the Polysemy Problem of Persian Words Using Mutual Information Statistics
In recent years, large monolingual, comparable and parallel corpora have played a very crucial role in solving various problems of computational linguistics including machine translation, information retrieval, natural language processing, and the like. This paper tries to solve the problem of polysemy of Persian words while translating them into Persian by the computer. We use Mutual Informati...
German, French, English and Persian Retrieval Experiments at CLEF 2009
We describe evaluation experiments conducted by submitting retrieval runs for the monolingual German, French, English and Persian (Farsi) information retrieval tasks of the Ad Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2009. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant records or documents (with high p...
Automatic Term Extraction for Cross-Language Information Retrieval Using a Bilingual Parallel Corpus
Information retrieval is a crucial area of natural language processing (NLP) and can be defined as finding documents whose content is relevant to the query need of a user. Cross-language information retrieval refers to a kind of information retrieval in which the language of the query and that of the searched documents are different. This paper tries to construct a bilingual lexicon from an English...
German, French, English and Persian Retrieval Experiments at CLEF 2008
We describe evaluation experiments conducted by submitting retrieval runs for the monolingual German, French, English and Persian (Farsi) information retrieval tasks of the Ad-Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2008. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant records or documents (with high p...